Exploiting Discriminative Point Process Models for Spoken Term Detection

Authors

Atta Norouzian

Aren Jansen

Richard C. Rose

Samuel Thomas

Abstract

State-of-the-art spoken term detection (STD) systems are built on top of large vocabulary speech recognition engines, which generate lattices that encode candidate occurrences of each in-vocabulary query. These lattices specify start and stop times of hypothesized term occurrences, providing a clear opportunity to return to the acoustics to incorporate novel confidence measures for verification. In this paper, we introduce a novel exemplar distance metric to the recently proposed discriminative point process modeling (DPPM) framework and use the resulting whole word models to generate STD confidence scores. In doing so, we introduce STD to a completely distinct acoustic modeling pipeline, trading Gaussian mixture models (GMM) for multi-layer perceptrons and replacing dictionary-derived hidden Markov models (HMM) with exemplar-based point process models. We find that whole word DPPM scores perform comparably to, and are complementary to, lattice posterior scores produced by a state-of-the-art speech recognition engine.

Similar resources

Improved spoken term detection by discriminative training of acoustic models based on user relevance feedback

In a previous paper [1], we proposed a new framework for spoken term detection by exploiting user relevance feedback information to estimate better acoustic model parameters to be used in rescoring the spoken segments. In this way, the acoustic models can be trained with a criterion of better retrieval performance, and the retrieval performance can be less dependent on the existence of a set of...

Out-of-Vocabulary Spoken Term Detection

Spoken term detection (STD) is a fundamental task for multimedia information retrieval. A major challenge faced by an STD system is the serious performance reduction when detecting out-of-vocabulary (OOV) terms. The difficulties arise not only from the absence of pronunciations for such terms in the system dictionaries, but from intrinsic uncertainty in pronunciations, significant diversity in ...

Augmented set of features for confidence estimation in spoken term detection

Discriminative confidence estimation along with confidence normalisation have been shown to construct robust decision maker modules in spoken term detection (STD) systems. Discriminative confidence estimation, making use of term-dependent features, has been shown to improve the widely used lattice-based confidence estimation in STD. In this work, we augment the set of these term-dependent featur...

Discriminative spoken term detection with limited data

We study spoken term detection, the task of determining whether and where a given word or phrase appears in a given segment of speech, in the setting of limited training data. This setting is becoming increasingly important as interest grows in porting spoken term detection to multiple low-resource languages and acoustic environments. We propose a discriminative algorithm that aims at maximizing t...

Feature analysis for discriminative confidence estimation in spoken term detection

Discriminative confidence based on multi-layer perceptrons (MLPs) and multiple features has shown significant advantage compared to the widely used lattice-based confidence in spoken term detection (STD). Although the MLP-based framework can handle any features derived from a multitude of sources, choosing all possible features may lead to overly complex models and hence less generality. In this ...

Publication date: 2012
